Phenotypic diversity in qualitative and quantitative traits for selection of high yield potential field pea genotypes

Field pea (Pisum sativum L.) needs improvement to increase productivity because of its high price and demand. However, the incidence of powdery mildew (PM) disease limits its production. This study aimed to analyze the diversity of qualitative and quantitative traits together with powdery mildew resistance, using cluster and principal component analysis, to identify PM-resistant field pea genotypes with high yield potential. Shannon–Weaver's diversity index (Hʹ) displayed high intra-genotype diversity for quantitative and qualitative traits. Heterogeneity was identified for resistance against powdery mildew infection. Eighty-five genotypes were divided into five groups using Mahalanobis generalized distance (D²) statistics. The highest inter-cluster D² value was observed between clusters 2 and 3 (11.89), while the lowest was found between clusters 3 and 4 (2.06). Most of the genotypes differed noticeably, so they could be employed in a crossing scheme. Twelve genotypes were extremely resistant, 29 were resistant, 25 were moderately resistant, 18 were fairly susceptible, and 1 was susceptible to powdery mildew disease. Among the 29 resistant genotypes, BFP77, BFP74, BFP63, BFP62, BFP43, and BFP80 were high yielders and could be used directly and/or have their resistance transferred through hybridization to high-yielding, disease-susceptible genotypes. Among the 25 moderately resistant genotypes, BFP78, BFP45, BFP79, and BFP48 were found to be high yielders. In principal component analysis (PCA), the first four PCs with eigenvalues > 1 accounted for 88.4% of the variability for quantitative traits. Clustering sorted the genotypes into five groups, where groups 1 to 5 contained 37, 28, 1, 8, and 11 genotypes, respectively. Genotypes of cluster 4 were identified as high in yield and its attributes. Pearson correlations were significant and positive among all traits except PM. This variation suggests that there is scope to select promising genotypes for field pea breeding. Considering all features, BFP78, BFP77, BFP74, BFP63, BFP62, BFP45, BFP79, and BFP80 could be preferred as PM-resistant high yielders owing to their greater pod length, seeds per pod, and pods per plant.


Designing experiments and crop management
The field experiment was carried out during the cropping season (November to February) of the years 2022 and 2023 at the field of the Pulse Breeding Division, PRC, BARI, Ishwardi, Pabna, Bangladesh, which is situated at a mean height of 15-19 m above sea level and is located at 24.75°N latitude and 88.5°E longitude. A Randomized Complete Block Design (RCBD) with three replications was used to set up the research. The unit plot for continuous line sowing was 4.0 m by 1 m (two lines). Plant-to-plant spacing was kept at 5-7 cm inside a row. The use of pre-sowing irrigation helped achieve optimal germination. The experimental plots were well-prepared before planting, and we added Farm Yard Manure (FYM). Field pea cultivation was done using the recommended amounts of manure and fertilizer 71. The seeds were treated with Provex at 2.5 g/kg seed prior to sowing. The experiment was routinely weeded, and irrigation was given as needed. As and when required, other cultural activities were carried out.

Determination of agro-morphological and qualitative characteristics
Ten competing plants were chosen at random from each entry in each replication, and the observations were averaged by dividing the total by ten. Data were recorded on days to 80% flowering (DF), days to maturity (DM), plant height (PH), pod length (cm) (PL), seeds per pod (SPP), pods per plant (PPP), hundred seed weight (HSW), and yield per plant (YPP). Data on grain yield and yield-related traits were collected on a plant basis. Days to flowering (DF) and days to maturity (DM) were recorded when each block reached 80% flowering and 90% physiological pod maturity, respectively, and were calculated as the number of days from the date of sowing to each of these stages. Plant height (PH), pod length (PL), seeds per pod (SPP), pods per plant (PPP), and yield per plant (YPP) were recorded after harvest from ten randomly selected plants in each plot. Plant height was measured in cm from the base of the plant to its tip. Pod length was measured in cm on ten randomly selected pods using a scale. Seeds per pod was determined from the number of seeds in ten randomly selected pods. Pods per plant was determined from the number of pods on ten randomly selected plants. A hundred dried and cleaned seeds were counted and weighed to measure hundred seed weight (HSW). To measure yield per plant, ten randomly selected plants were collected and their seeds were separated, dried, and cleaned. A set of local descriptors, produced with the help of several scientific descriptors from the European Union descriptor (UPOV) and IBPGR, was used to capture the qualitative characteristics 72,73.

Disease data scoring
Disease reactions to powdery mildew at the early, flowering, and pod setting stages were documented for the various genotypes on a whole-plot basis 60 days after planting, at three times 74 (Table 2). After seed germination, 10 plants were selected at random from each line and scored for morphological data and powdery mildew. The development of the powdery mildew disease was started by natural infection, and the disease infection was created by spraying the fungicide Benomyl at 2.5 kg/ha at fixed spray intervals of 7, 14, and 21 days 75. The fungicide was applied using a knapsack sprayer with a spray volume of 60.6 ml per 2.4 m² plot. Fungicide application began as soon as the first detectable disease symptom appeared. Disease was scored on a scale of 1 to 9; details are presented in Table 2. Following Singh 76, the disease data collected on this scale were converted to a percentage disease index (PDI). The PDI was calculated for each genotype using the formula:
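As an illustration only, the sketch below uses a commonly used form of the percentage disease index: the sum of all numerical ratings divided by the product of the number of plants scored and the maximum grade, multiplied by 100. This form is an assumption and may differ from the exact expression of Singh 76; the function name and the example scores are hypothetical.

```python
def percentage_disease_index(scores, max_grade=9):
    """Return the PDI (%) for one genotype from per-plant disease scores.

    Assumed form: PDI = sum(scores) / (number of plants * max_grade) * 100.
    max_grade=9 matches the 1 to 9 scale described in the text.
    """
    if not scores:
        raise ValueError("at least one disease score is required")
    return 100.0 * sum(scores) / (len(scores) * max_grade)


# Hypothetical scores for the 10 plants sampled from one line.
example_scores = [1, 3, 3, 5, 1, 3, 5, 3, 1, 3]
print(f"PDI = {percentage_disease_index(example_scores):.1f}%")
```

With this form, a genotype whose plants all score the maximum grade has a PDI of 100%, so the index can be compared across genotypes scored on the same scale.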

Statistical analysis
Both quantitative and qualitative characteristics were used to evaluate the varieties. 11 qualitative and 8 quantitative measurements were combined. Five plants and five pods per plant per accession were used to gather the observations. All the sample data of a trait for each treatment were averaged to obtain a replication mean. Statistics and biometrics were used to assess the average data of the different quantitative traits. The ANOVA, descriptive statistics, and LSD were calculated using Statistix 8. Microsoft Excel was used to assess the phenotypic diversity for each qualitative feature using the Shannon-Weaver diversity index (Hʹ) 77. Excel was also used to generate the standardized Hʹ using the following formula 77: where Hʹ is the Shannon diversity index, n is the number of individuals of a certain kind or species, and N is the total number of individuals of a community. According to Eticha et al. 78, the diversity index was categorized as low (0. Using the doe-bioresearch packages, the findings for various agro-morphological parameters were evaluated. The means were separated using the least significant difference at a 5% level of significance. Pooled data from two years (2022 and 2023) were used to perform all analyses. R (v 4.0.5) and RStudio were used to conduct the multivariate analysis 79. Principal component analysis (PCA) was used to estimate the degree of association between characteristics, and cluster analysis was used to group genotypes based on traits. The two-way hierarchical clustering heatmap was created with R's ComplexHeatmap package using the Ward.D2 method and Euclidean distance. The R packages ggplot2, factoextra 80, and FactoMineR 81 were used to create the PCA-biplot. The R package corrplot was used to create the correlation matrix 82, which was then ordered by hclust. Cluster analysis was used to determine the cluster means and standard deviations for each characteristic. Cluster analysis provided the cluster mean values, distance values, and dendrogram; PCA provided the eigenvalues, variability, cumulative variability, and vector components.
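The diversity index described above can be sketched as follows. The sketch assumes the usual form Hʹ = −Σ (n/N) ln(n/N), standardized by dividing by ln(k), where k is the number of phenotypic classes; the standardization and the function name are assumptions, not the authors' Excel workflow. The example counts (63 erect and 22 flat genotypes) are back-calculated from the growth-habit percentages reported in the results (74.12% and 25.88% of 85).

```python
import math


def shannon_weaver(counts, standardized=True):
    """Shannon-Weaver index from class counts (n per class; N = sum of counts).

    Assumed form: H' = -sum(p * ln p) with p = n / N.
    If standardized, H' is divided by ln(k), k = number of classes,
    so that the result lies between 0 and 1.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    # Empty classes contribute nothing to the sum (p * ln p tends to 0).
    proportions = [c / total for c in counts if c > 0]
    h = -sum(p * math.log(p) for p in proportions)
    if standardized:
        k = len(counts)
        h = h / math.log(k) if k > 1 else 0.0
    return h


# Growth habit: 63 erect and 22 flat genotypes out of 85.
print(round(shannon_weaver([63, 22]), 2))
```

A standardized value close to 1 means the classes are nearly equally frequent; a value close to 0 means one class dominates.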

Qualitative traits
The qualitative traits of the tested field pea genotypes are presented in Table 3, together with the conversion values of the individual traits. Flower color variation and the variability of leaf and tendril characteristics are shown in Fig. 1. Our results showed significant variation among the traits studied in the field pea genotypes. The leaves were green in 69 genotypes (76.47% of the total), and the remaining 12 genotypes had yellow-green leaves (14.12%). The tendrils of the collected genotypes were mainly green (56.47%), followed by pale green (36.47%) and purple green (7.06%). High leaf sizes were found in medium (43.53%), small, large, very small, and very large (43.53%), 29.41, 16.47, and 3.53%, respectively. Most genotypes (59 of 85) had creamy white flowers (69.41%). Great diversity in immature pod color was identified. Green, light green, slightly dark green, and slightly light green were the major types, with 56.47, 16.47, 14.12, and 10.59% of the total, while the remaining genotypes had dark green pods (2.35% of the total). Tendril twining was most often intermediate (60%) and least often high (16.47%). Two types of growth pattern were recognized in the evaluated genotypes: erect (74.12%) and flat (25.88%). Rough pod texture was the most common among the collected field pea genotypes, followed by the smooth and tuberculate types. The majority of genotypes had weakly curved pods (72.94%), and a smaller share had no pod curvature (14.12%). Regarding seed size, three types were recorded: medium (72.94%), small (20.00%), and large (7.06%). The seed colors of the collected genotypes were mainly whitish green (51.76%) and cream (29.41%). For seed wrinkling, two types of variation were noted. The frequency distribution of the genotypes for quantitative traits is shown in Fig. 2. The histograms show wide variation among the evaluated genotypes for the tested traits (Fig. 2).

Genotypic variations in agro-morphological traits
The combined analysis of variance over the two years of data revealed significant differences among the tested field pea genotypes for the studied traits (Table 4). However, the effect of year was non-significant (Table 4). For this reason, the data of 2022 and 2023 were pooled, and the pooled data were used for the diversity study and the other multivariate analyses. The wide range of mean performance indicated that significant variation was present in all of the investigated features, particularly in yield, seed size, pod setting, and disease response, for each characteristic in the 85 genotypes of field peas that were studied. DF had a mean of 54.88 days and ranged from 25 days (BFP30) to 91 days (BFP56) (Tables 5 and 6; Supplementary Tables S1 & S3). DM varied from 84 days (BFP84) to 112 days (BFP57 and BFP61), with a grand mean of 103.14 (Tables 5 and 6). Plant height ranged from 11.94 to 219.62 cm, with a mean of 113.68 cm (Tables 5 and 6; Supplementary Tables S1 & S3). PPP differed substantially across genotypes, with BFP74 (47.78) and BFP65 (6.3) having the highest and the lowest values, respectively, while the mean was 21.13 (Tables 5 and 6; Supplementary Tables S1 & S3). The SPP and PM had grand means of 4.87 and 3.71, with a range of 5.32 to 6. BFP04 had the highest PL (6.14), followed by BFP55, BFP65, and BFP14 (Tables 5 and 6; Supplementary Tables S1 & S3). HSW varied significantly between genotypes. The lowest HSW was shown by BFP50, while the highest HSW was shown by BFP44 (Tables 5 and 6; Supplementary Tables S1 & S3). The YPP of the field pea genotypes varied from 0.75 to 26.16 g. BFP78 gave the greatest YPP, followed by BFP77, BFP74, BFP72, BFP45, and BFP63. BFP77, BFP74, BFP72, and BFP63 were highly resistant to resistant and high yielding. The genotypes BFP78, BFP79, and BFP48, on the other hand, were highly productive and only moderately resistant. BFP44, however, demonstrated high production potential and moderate susceptibility (Table 7; Supplementary Tables S2 & S4).

Shannon Weaver Diversity analysis combined with descriptive statistics
The diversity of the accessions for the quantitative characteristics shown in Table 6 was assessed using descriptive statistics (average, range, and standard deviation) and Hʹ. The coefficients of variation of all traits fell into two categories: medium (between 10 and 20%) or low (below 10%). The coefficients of variation of the nine quantitative features varied from 1.04 to 12.15%. The YPP population had the highest CV (12.
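The classification of the coefficient of variation can be illustrated with a small sketch. It computes the sample CV as the standard deviation divided by the mean, times 100, which is a simplification (a CV reported from an ANOVA is normally based on the error mean square). The "high" class above 20% is an assumption added for completeness, and the example values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV (%) = standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


def cv_class(cv):
    """Classify a CV (%) as low (< 10) or medium (10 to 20), as in the text.

    The 'high' label for values above 20 is an assumption.
    """
    if cv < 10:
        return "low"
    if cv <= 20:
        return "medium"
    return "high"


# Hypothetical yield-per-plant values (g) for three replications.
cv = coefficient_of_variation([18, 20, 22])
print(f"CV = {cv:.1f}% ({cv_class(cv)})")
```

Applied to the extremes reported above, a CV of 1.04% falls in the low class and a CV of 12.15% in the medium class.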

Response of genotypes to powdery mildew disease
The field pea genotypes were tested in the field at three development stages for natural infection with the powdery mildew disease caused by Erysiphe polygoni. From the early stages to flowering and pod setting, the disease's severity increased (Table 7, Supplementary Tables S2 & S4, and Fig. 3). The responses of the genotypes to powdery mildew varied greatly. Among the genotype sources, 14.11 and 34.11% of the genotypes were found to be highly resistant and resistant, respectively, to E. polygoni infection. Out of the total 85 field pea genotypes, twelve genotypes (BFP21, BFP29, BFP30, BFP31, BFP33, BFP34, BFP35, BFP52, BFP72, BFP73, BFP81 and BFP84), i.e. 14.11%, were highly resistant (Disease Severity Scale 2) ( Severity Scale 7) (Table 8 and Fig. 3). The genotypes BFP21, BFP30, BFP31, BFP33, BFP34, BFP52, and BFP72 were highly resistant, along with the check variety BFP84 (BARI Motor-2).

Principal component analysis (PCA)
The results of PCA showed that only the first four principal components (PCs) had eigenvalues greater than 1.00 and that together they explained about 88.4% of the variability across field pea genotypes for yield component attributes (Fig. 4a,b). In field pea improvement programs, the traits corresponding to these PCs may be given the appropriate weight. Nine characteristics were used for the principal component analysis (Fig. 4b). The first three PCs explained 72.2% of the variance, with PC1, PC2, and PC3 accounting for 30, 23.9, and 18.3%, respectively (Fig. 4a). PC1 was followed, in order of variance explained, by PC2, PC3, PC4, and PC5. The earlier research by Hanci and Cebeci 83 provided further support for the current investigation. Eigenvalues aid the decision of how many components to keep. According to Sharma 54, the sum of the eigenvalues is often equal to the number of variables. The first principal component (PC1) explained 30% of the variance, with the key positive contributions coming from YPP, HSW, PH, and PPP, while the significant negative contributors were PL, DM, and SPP (Fig. 4c). PC2 was mostly linked to DF, DM, and PPP and accounted for 23.9% of the overall variance (Fig. 4d).
Figure 5 depicts the magnitude and direction of the quantitative features' contribution to the various main components. The PCA-biplot was created using the first two PCs, which together accounted for 53.9% of the total variability (Fig. 5). The key yield and yield-attributing traits, however, grouped in trait clusters 2 and 3, such as HSW, PL, and YPP, were mostly connected with and positively contributed to PC2 and PC3 (Figs. 4c,d, 5). Based on the features that contribute to yield, the genotypes in clusters 4 and 5 dominated substantially and were favorably emphasized in both PC1 and PC2 (Fig. 5); in contrast, the genotypes in cluster 3 were strongly reflected by the traits DF, DM, and PL (Fig. 4).
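The percentages reported in this section can be cross-checked with simple arithmetic. The sketch below assumes the PCA was run on standardized traits (a correlation matrix), in which case the eigenvalues sum to the number of traits and each eigenvalue equals its share of variance times that number. This assumption, like the function names, is illustrative; the text does not state how the traits were scaled.

```python
N_TRAITS = 9  # nine quantitative characteristics entered the PCA

# Variance explained (%) as reported for the first three PCs.
explained = [30.0, 23.9, 18.3]


def cumulative(percentages):
    """Running total of explained variance, rounded to one decimal."""
    totals, running = [], 0.0
    for p in percentages:
        running += p
        totals.append(round(running, 1))
    return totals


def implied_eigenvalue(percent, n_traits=N_TRAITS):
    """Eigenvalue implied by a variance share under a correlation-matrix PCA."""
    return percent / 100.0 * n_traits


print(cumulative(explained))          # first two and first three PCs combined
print(implied_eigenvalue(30.0))       # eigenvalue implied for PC1
print(100.0 / N_TRAITS)               # variance share at which an eigenvalue equals 1
```

Under this assumption, a component passes the eigenvalue > 1 rule only if it explains more than about 11.1% of the variance. The fourth PC, at 88.4 − 72.2 = 16.2%, satisfies this, which is consistent with four components being retained.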

Analysis of cluster distances
Based on D² values, cluster analysis was used to divide the 85 genotypes into 5 distinct groups. The number of genotypes in each cluster ranged from 1 to 37. Cluster 1 included the greatest number of genotypes (37). Cluster 3 included the fewest, with only one genotype. Clusters 2, 4, and 5 contained 28, 8, and 11 genotypes, respectively (Table 9 and Fig. 6). Prasad et al. 1 and Kumar et al. 84 noted that the separation of lines into many distinct clusters is evidence of great variety in the material under evaluation. Table 9 and Fig. 6 provide estimates of intra- and inter-cluster distances for the five clusters. Cluster 5 had the highest intra-cluster value (3.40), followed by Cluster 2 (3.01), Cluster 1 (2.92), and Cluster 4 (2.06), indicating that the genotypes within these clusters exhibit substantial genetic diversity (Table 9). Cluster 2 and Cluster 3 had the greatest inter-cluster distance (11.89), followed by Cluster 1 and Cluster 3 (11.43) and Cluster 3 and Cluster 5 (5.35), indicating the greatest genotypic diversity between these clusters. Therefore, improved segregants for high seed production and yield-contributing characteristics, arising from non-allelic interactions, can be expected if diverse genotypes from these groups are employed in breeding programs along with other desired features. Clusters 3 and 4 were the least divergent pair and appear to share a similar genetic architecture, as shown by the smallest inter-cluster distance between them (2.06), followed by clusters 4 and 5 (3.40) (Table 9). To break the unfavorable relationship between yield and its associated traits, such genotypes may also be employed in breeding programs to create bi-parental crosses between the most divergent and the closest groups.
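To make the distance statistic concrete, the sketch below computes a Mahalanobis D² between two cluster mean vectors for the two-trait case, where the covariance matrix can be inverted by hand. It assumes the general definition D² = (x − y)ᵀ S⁻¹ (x − y) with S a pooled covariance matrix; the cluster means and covariance values are hypothetical and are not taken from Table 9.

```python
def mahalanobis_d2(mean_a, mean_b, cov):
    """Mahalanobis D^2 between two 2-trait mean vectors.

    cov is a 2x2 covariance matrix ((s11, s12), (s21, s22)).
    Uses the closed-form inverse of a 2x2 matrix.
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = mean_a[0] - mean_b[0]
    dy = mean_a[1] - mean_b[1]
    # (dx, dy) * inverse(cov) * (dx, dy)^T, written out term by term.
    return (s22 * dx * dx - (s12 + s21) * dx * dy + s11 * dy * dy) / det


# Hypothetical cluster means for (pods per plant, yield per plant in g).
cluster_a = (38.0, 20.6)
cluster_b = (21.0, 9.5)
pooled_cov = ((25.0, 6.0), (6.0, 9.0))  # hypothetical pooled covariance
print(round(mahalanobis_d2(cluster_a, cluster_b, pooled_cov), 2))
```

Unlike a plain Euclidean distance, D² down-weights traits with large variance and accounts for correlation between traits, which is why it is often preferred when traits are measured in different units.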

Genotypes are grouped using a heatmap-oriented clustering pattern
A heatmap is a two-dimensional data visualization technique that uses color to show the size of a phenomenon. By examining color variation by intensity, the reader may observe how the phenomenon is categorized or varies through time. On a backdrop of mostly minor features, it depicts the relative distribution of strongly expressed qualities (Figs. 6 and 7). The heatmap analysis produced two dendrograms, one in the vertical direction representing the germplasm accessions and the other in the horizontal direction reflecting the attributes that caused the diffusion. Eighty-five field pea genotypes were used in the current research, and using heatmap-oriented cluster analysis, the genotypes were divided into five groups based on the average values of all the analyzed variables. The distribution of genotypes in the clusters showed that cluster 1 had the highest number of genotypes (37) and cluster 3 had the lowest (1) (Table 10 and Figs. 6 and 7). Three trait groups can be seen on the other dendrogram. DF and DM are the two characters in Group 1. Three characters (YPP, HSW, and PL) belong to Group 2. Three characters (SPP, PH, and PPP) belong to Group 3 (Fig. 7).

Analysis of clustered means
To determine the genotypic diversity present across all study groups, a dendrogram of the 85 field pea genotypes was created using the Ward clustering technique 85. The significant degree of variation in cluster means for several traits (Table 10) further supported this diversity. Cluster means were evaluated for eight yield and yield-contributing characteristics, along with PM (Table 10). The comparison of means for the various characters revealed significant variation among the clusters for each character. Cluster 5 had the highest mean for DF (81.45), followed by Cluster 3 (57), while Cluster 2 had the lowest mean for DF (47.04). The mean values for Cluster 3 were the greatest for DM, HSW, and PM. In cluster 4, the greatest means for PH (190.05), PPP (38.03), and YPP (20.65) were found. Twenty-eight genotypes constituted Cluster 2, which had the second highest number of genotypes. This cluster had smaller seeds than the others, was more moderately vulnerable to powdery mildew, and had a lower YPP than the others. In cluster 2, none of the characters had the highest mean value. These results showed that certain clusters performed better for particular characters.
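Ward's method, used here to build the dendrogram, can be sketched in a few lines. At each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares, which for clusters of sizes n_a and n_b with centroids c_a and c_b is n_a·n_b/(n_a + n_b)·‖c_a − c_b‖². The sketch below is a plain-Python illustration of that criterion, not the R workflow used in the study; the data points are hypothetical and traits are assumed to be already standardized.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    points: list of equal-length numeric tuples (one per genotype).
    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                members_i, centroid_i = clusters[i]
                members_j, centroid_j = clusters[j]
                n_i, n_j = len(members_i), len(members_j)
                dist2 = sum((a - b) ** 2 for a, b in zip(centroid_i, centroid_j))
                # Increase in within-cluster sum of squares if i and j merge.
                cost = n_i * n_j / (n_i + n_j) * dist2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        members_i, centroid_i = clusters[i]
        members_j, centroid_j = clusters[j]
        n_i, n_j = len(members_i), len(members_j)
        merged_centroid = [
            (n_i * a + n_j * b) / (n_i + n_j)
            for a, b in zip(centroid_i, centroid_j)
        ]
        clusters[i] = (members_i + members_j, merged_centroid)
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(members) for members, _ in clusters]


# Hypothetical two-trait data: two tight pairs and one outlying genotype.
data = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(ward_clusters(data, k=3))
```

The outlying point stays in a cluster of its own, which is the same kind of outcome as the single-genotype cluster 3 reported above.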

Analysis of the trait associations
There were very strong relationships among the traits that were assessed (Fig. 8). The coefficient of correlation measures the linear relationship between two variables. The correlations among the nine parameters under field conditions are presented in Fig. 8. PPP had a significant positive correlation with PH (r = 0.65, p < 0.001) and YPP (r = 0.46, p < 0.001), whereas it had a nonsignificant negative correlation with DF, PM, and SPP under field conditions. PH showed a significant positive correlation with YPP (r = 0.34, p < 0.01) and DM (r = 0.24, p < 0.05), while it showed a negligible negative correlation with DF, PM, and SPP. YPP showed a highly significant positive correlation with HSW (r = 0.85, p < 0.001), while it showed a negligible negative correlation with DF and PM. DF showed a highly significant correlation with DM (r = 0.54, p < 0.001). Powdery mildew (PM) disease severity had a nonsignificant positive correlation with HSW (r = 0.03) and SPP (r = 0.15), whereas the rest of the traits had a nonsignificant negative correlation with powdery mildew disease severity (Fig. 8). PL showed a highly significant positive correlation with SPP (r = 0.80, p < 0.001).
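The significance levels quoted for these correlations can be sanity-checked with the usual t-test for a Pearson coefficient, t = r·√((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The sketch below assumes n = 85 (one mean per genotype), which the text does not state explicitly; the function names and the short example series are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical pods-per-plant and yield-per-plant values for five genotypes.
ppp = [12, 18, 25, 31, 40]
ypp = [5.1, 7.9, 9.8, 14.2, 16.0]
print(round(pearson_r(ppp, ypp), 3))

# The weakest correlation reported as significant: r = 0.24 with n = 85 assumed.
print(round(t_statistic(0.24, 85), 2))
```

For 83 degrees of freedom the two-sided 5% critical value of t is about 1.99, so under the n = 85 assumption the t value for r = 0.24 exceeds it, in line with the p < 0.05 reported for PH and DM.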

Discussion
An established crop breeding technique for managing and successfully using plant genetic resources is genotype evaluation and screening for desired traits 86. The degree of genetic diversity in agro-morphological variables associated with yield determines the breeding strategy. Research on multivariate analysis and genetic diversity is essential for an effective genotype assessment. Plant genotypes exhibit a great degree of morphological variation, and this diversity is used to create better cultivars of important crops. Crops that are less popular or less widely used need to benefit from this development as well, since they have a high degree of variability both within (intra-variation) and between (inter-variation) their accessions. All studied field pea genotypes showed variation across several traits, which indicates a wide array of variability among the traits. A similar kind of variability for traits in rice was recorded by several researchers 87,88. Breeders may choose better lines for future development by using morphological characterization in diverse areas 89-91. Analyzing morphological traits is a common method for determining genetic diversity in many crop species, including field peas. It has been used successfully on a variety of crops, including Pisum sativum 73, mungbean 48,92, black gram 93, amaranth 94-98, maize 99, and field pea 17,22,23.
The findings of this research indicate high Shannon-Weaver diversity indices across field pea populations for the qualitative attributes of tendril color, tendril twining, immature pod color, flower color, and seed size. Shubha et al. 100 and Rosero-Lombana et al. 101 both reported similar findings for field pea. Table 3 displays the results of the Shannon-Weaver diversity study for qualitative traits. The diversity indices varied from 0.36 for leaf color to 0.89 for pod curvature. Seven qualitative traits in this research have a high diversity index (H = 0.60). The results of this research showed that there was little variation in leaf color and pod curvature across field pea genotypes, in contrast to studies 73,102. For the majority of the characteristics listed in Tables 4 and 5, a broad range of differences was observed. The findings are validated by other research 17,73,103 and show that features like PH, HSW, DF, PPP, DM, and YPP varied very widely in mean performance. There was a lot of phenotypic diversity in seed yield and related traits. The great degree of diversity in yield and its associated traits was highlighted by the mean performance in this research (Table 3), suggesting that future breeding programs will have more opportunities to make use of these traits 104. In keeping with the findings of the current study, several researchers, including our own earlier studies 16,17, showed significant heterogeneity in field pea yield and its associated characteristics. The significant differences among the tested field pea genotypes for the studied traits might be due to differences in their genetic composition. This indicates that the tested genotypes have different potential for field pea production for the studied characters, which corroborates the results of Mogiso 105 and Gurmu 106 in field pea. Significant variations across genotypes have also been reported in the literature 107-112, which supports the current findings. Crops that are affected by powdery mildew disease suffer large yield losses 74,113,114. Field pea breeding for resistance to powdery mildew needs effective disease screening techniques. When field peas are produced for seed, powdery mildew losses are greater because the disease becomes more severe as the crop matures 114. Natural sources of resistance with varying disease responses to powdery mildew have been found in field pea germplasm, regardless of its origin 32. According to Teshome 75, powdery mildew disease significantly reduces the yield potential of field pea germplasm grown throughout the world, causing losses of up to 86%. Therefore, the best option for crop breeding is genetically based resistance to harmful diseases 32,75,115. In order to choose powdery mildew-resistant lines for documentation and to identify top lines for widespread cultivation, the chosen pea lines were also assessed for powdery mildew disease.
Researchers have used a variety of approaches to screen for powdery mildew 29,75,115, but artificial inoculation has been found to be the most reliable and effective 29,105,115,116. In this study, the same eighty-five germplasms were tested in a field setting for resistance to powdery mildew. Out of the 85 germplasms listed in Tables 6 and 7, none was found to be extremely resistant (immune). However, 29 germplasms were observed to be resistant and 12 to be highly resistant. Eighteen genotypes were moderately susceptible, while the remaining twenty-five germplasms exhibited moderate resistance. The BFP05 genotype was susceptible to PM disease. In order to better understand the level of genetic diversity and the agronomic performance of resistant germplasm accessions, more information should be gathered 29,117,118. Germplasm accessions that combine high levels of resistance with agronomic superiority may reduce the time plant breeders need to eliminate undesired genes via repeated backcrossing. Three basic selection strategies, namely tandem selection, independent culling levels, and index selection, can be utilized for the improvement of plant species in breeding programs.
Tandem selection attempts to improve a breeding germplasm for several traits by selecting for one trait at a time over several generations, then focusing on another trait in the next breeding cycle 119. Identification of diversity serves two roles: it helps to organize the identity and integrity of an ex-situ collection, and it also provides a structure for accessing this diversity for breeding efforts 120. PCA is a useful method for locating significant characteristics that have a bigger influence on the total variation, and each vector's coefficient indicates the percentage contribution of each original variable to the principal component it is linked to (Sanni et al. 121). According to studies, the first three principal components are often the most significant in showing the patterns of variation among the various genotypes and the traits linked to genotype differentiation. Raji 122 asserted that traits with coefficients greater than 0.3 (regardless of whether they are positive or negative) are important, while traits with coefficients less than 0.3 are thought to have the least impact on the overall variation seen. This methodology was used in the current study 123.
Based on the dendrogram produced by cluster analysis and PCA, the 85 genotypes of field pea were divided into five groups (Tables 9 and 10 and Fig. 6). Sharma 103 determined a similar clustering pattern using PCA and hierarchical cluster analysis for 22 field pea genotypes. According to Euclidean distance, clusters mostly emerge following the origin of genotypes or geographical locations. Most genotypes with the same geographic origin are clustered together, although a smaller number of genotypes with different origins are also included in the same cluster. Understanding the link between variables may be aided by the use of multivariate statistical analysis, such as PCA. Such analysis might help determine the nature of the traits and simplify data collection 124. The first four principal components in this study's PCA of the nine quantitative characteristics explained 88.4% of all variance, indicating a very strong link between the features under investigation (Fig. 4a). The first PC was the most important, accounting for 30% of the variance on its own. Because of their significant loadings, YPP, HSW, PH, and PPP were crucial in the first PC for distinguishing the genotypes (Fig. 4c). Similar findings were reported in field pea by Chowdhury and Mian 125. The PCA biplot is a multivariate analysis method that combines characteristics and genotypes in two dimensions while eliminating overlapping variation from large, complicated data sets, making it easier to identify key traits for selection (Fig. 5). As a result, PCA revealed distinct trait differentiation as well as significant variation across the five groups of the 85 field pea genotypes (Fig. 5). The traits PPP, HSW, PH, and YPP contributed significantly to describing the variation among the BFP74, BFP78, BFP76, BFP80, BFP75, BFP45, and BFP73 genotypes, and as a result, it will be possible to improve the genetic diversity of field pea genotypes through selection using these traits. Singh et al. 113 claimed that, when choosing the cluster to utilize for further selection and the pattern to use for hybridization, greater focus should be placed on the characteristic that provides the greatest divergence. The correlation matrix of the observed traits in Fig. 8 also confirmed the variability of traits determined by PCA biplot and cluster analysis.
Cluster analysis is one of the most widely used statistical methods for categorizing items into groups whose members have a lot in common with one another. The clusters will be useful for future heterotic breeding since different sets of alleles may affect their traits and performance 125. According to earlier publications by Prasad et al. 1 and Kumar et al. 126, the separation of lines into many distinct clusters is evidence of great variety in the material under evaluation. Table 9 displays the estimated intra- and inter-cluster distances for the five clusters. Cluster 5 had the largest intra-cluster distance, followed by clusters 2, 1, and 4, whereas clusters 2 and 3 had the greatest inter-cluster distance, followed by clusters 1 and 3 (Table 9). In cluster 4, with 8 genotypes, the values of PPP, PL, PH, and YPP were substantially higher (Table 10). The findings of this research showed that traits positively and significantly related to yield had the ability to enhance seed production. Since they demonstrated a favorable and statistically significant link with grain production, these traits were taken into account throughout the selection process. Thus, these field pea genotypes might be employed as parental sources in field pea hybridization programs for a variety of applications.
In conclusion, both multivariate statistical analysis tools showed wide genetic diversity among the landraces in the study for qualitative and quantitative characteristics during 2022 and 2023. The results showed that there was sufficient genetic variation within and between genotypes, suggesting the prospect of further genetic improvement in field pea yield and yield-related traits. The studied qualitative and quantitative traits were also associated with one another. There was significant variation in seed yield and its related traits and in resistance to powdery mildew disease, suggesting the possibility of selecting promising gene pools that could be used as
https://doi.org/10.1038/s41598-024-69448-7

Figure 1. Flower color variation and variability of leaf and tendril structure of field pea genotypes.

Figure 2. Distribution of 85 field pea genotypes for the eight yield and yield-contributing traits.

Table 4.

5 .Table 6 .
The average performance of the genotypes for the characters (pooled values of 2022 and 2023). DF = days to 80% flowering, DM = days to maturity, PH = plant height, PPP = pods per plant, SPP = seeds per pod, PL = pod length (cm), HSW = hundred seed weight, PM = powdery mildew, and YPP = yield per plant.

The descriptive statistics and analysis of eight quantitative morphological traits of field pea genotypes (pooled values of 2022 and 2023). Hʹ = Shannon–Weaver diversity index, DF = days to 80% flowering, DM = days to maturity, PH = plant height, PPP = pods per plant, SPP = seeds per pod, PL = pod length (cm), HSW = hundred seed weight, PM = powdery mildew, and YPP = yield per plant.

Figure 4. (a) Proportion of variance (%) of the top 8 principal components (PCs), (b) eigenvalues of the top 8 PCs, (c) contribution of variables to PC1 (%), and (d) contribution of variables to PC2 (%) derived from principal component analysis (PCA). Red dashed lines across the bar plots are reference lines; variables whose bars rise above the reference line are considered important contributors to the respective PC.

Figure 5. Biplot of the PCA showing the association between the measured traits and the field pea genotypes. The five groups of the 85 field pea genotypes are represented by different colors of the individuals (genotypes). The magnitude of the overall contribution of the variables to PC1 and PC2 is shown by the length and color intensity of the arrows. PC1 on the x-axis accounted for 30% of the total variability, whereas PC2 on the y-axis accounted for 23.9%.

Figure 6. Agglomerative hierarchical clustering (AHC) dendrogram, based on Euclidean distance and the Ward method, grouping the 85 field pea genotypes into clusters for quantitative morphological traits.

Figure 7. Heatmap showing the grouping pattern of the 85 field pea genotypes for 8 quantitative traits. Each row denotes a genotype, whereas each column denotes a character. The colors and intensities (scale −2 to 6) were adjusted according to the relationship between the genotypes and the characters. The colors red and green stand for lower values, blue for higher values, and green for mid-values.

Table 10.

Figure 8. Correlations among the traits scored: Pearson's correlation matrix and performance analytic chart showing the relationships among the variables scored on field pea accessions. PH = plant height, DF = days to 80% flowering, DM = days to maturity, PL = pod length (cm), PPP = pods per plant, SPP = seeds per pod, PM = powdery mildew, HSW = hundred seed weight, and YPP = yield per plant. *, ** and *** = significant at p < 0.05, p < 0.01 and p < 0.001, respectively.

Table 1. List of the 85 studied field pea genotypes with their source and code.

Table 2. Disease scoring scale used for powdery mildew infection.

Table 7. Response of different field pea genotypes at different growth stages screened against powdery mildew (Erysiphe polygoni) (pooled values of 2022 and 2023).

Table 8. Disease response, frequency, and percentage of the field pea genotypes screened against Erysiphe polygoni (pooled values of 2022 and 2023).

Table 9. Average intra- and inter-cluster distances for five clusters of field pea genotypes.